ABHIDHA: An extended WordNet for Indo-Aryan Languages

Authors

  • Shireesh Reddy Annam
  • Monojit Choudhury
  • Sudeshna Sarkar
  • Anupam Basu
Abstract

A lexical knowledge base is an important component of any intelligent information processing system. The WordNet developed at the Cognitive Systems Laboratories at Princeton has served as a lexical reference system for natural language processing activities. The Indian-language activities at our institute, mainly text-to-speech synthesis and natural language generation from iconic inputs, require additional features in the lexical reference system, such as phonology, word roots and etymological information. Our initial efforts have been in Hindi and Bengali, but the commonality of Indo-Aryan languages and the importance of these extra features lead us to believe that it is worthwhile to build a WordNet containing these features for other Indo-Aryan languages. In this paper we discuss the issues relating to the structured design and development of a generalized extended WordNet for Indo-Aryan languages, with special reference to Hindi and Bengali.

Similar resources

Introduction to Gujarati wordnet

Gujarati is one of the 22 official languages of India. It is an Indo-Aryan language descended from Sanskrit. The Gujarati wordnet is being built using the expansion approach with Hindi as the source language. This paper describes experiences of building the Gujarati wordnet. It discusses basic features of the Gujarati language and evaluates the suitability of Hindi for the expansion approach. Various issue...

Building a WordNet for Sinhala

Sinhala is one of the official languages of Sri Lanka and is used by over 19 million people. It belongs to the Indo-Aryan branch of the Indo-European languages and its origins date back at least 2000 years. It has developed into its current form over a long period of time with influences from a wide variety of languages including Tamil, Portuguese and English. As for any other language, a Wo...

Dialects in the Indo-Aryan landscape

The Indo-Aryan language family currently occupies a significant region of the Indian subcontinent, its member languages being spoken in the bulk of North India, as well as in Pakistan, Bangladesh, Nepal, Sri Lanka, and the Maldives. The historical depth of the textual record and the geographical breadth of the Indo-Aryan linguistic area, the diversity of its languages (226 in all), and its many...

Why Indo-Aryan languages adapt English alveolars as retroflexes: Acoustic evidence from Punjabi

In Indo-Aryan languages, English loanwords containing the alveolar /t/ are always adapted as retroflex /ʈ/ [1]. It is argued that English alveolars share the cues of release burst with the retroflexes in Indo-Aryan languages [2]. However, no quantitative acoustic evidence is provided by [2] as to what acoustic cues of English alveolars are important for the speakers of Indo-Aryan languages to a...

Publication date: 2003